Sequences, Patterns and Coincidences
Abstract
This article provides a contemporary exposition, at a moderately quantitative level, of the distribution theory associated with sequences and patterns in iid multinomial trials, the birthday problem, and the matching problem. The section on patterns covers the classic distribution theory of waiting times for runs and for more general patterns, including their means and moments. It also includes the central limit theorem and almost sure properties of the longest run, and modern applications to DNA sequence alignment. The section on birthdays reviews the Poisson approximation in the classical birthday problem, with new bounds on the total variation distance. It also includes a number of variants of the classic birthday problem, including the case of unequal probabilities, similar triplets, and the Bayesian version with Dirichlet priors on birthday probabilities. A new problem, called the strong birthday problem, is introduced, with an application to criminology. The section on matching covers the Poisson asymptotics, errors of Poisson approximations, and some variants such as near matches. It also briefly reviews the recent extraordinary developments in the area of longest increasing subsequences in random permutations. The article provides a large number of examples, many not well known, to help the reader develop an intuitive feel for these questions.
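As a rough illustration of the Poisson approximations mentioned in the abstract, the following Python sketch compares the exact probability of a shared birthday with the standard approximation 1 - exp(-C(n,2)/365), and the exact probability of no match in a random permutation with its limit exp(-1). It is not taken from the article: the function names are hypothetical, and 365 equally likely birthdays are assumed.

```python
from math import comb, exp, factorial


def birthday_exact(n, days=365):
    """Exact P(at least two of n people share a birthday).

    Assumes `days` equally likely birthdays (an assumption of this sketch).
    """
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct


def birthday_poisson(n, days=365):
    """Poisson approximation: the number of matching pairs is treated as
    Poisson with mean C(n, 2) / days."""
    return 1.0 - exp(-comb(n, 2) / days)


def no_match_exact(n):
    """Exact P(no fixed point) in a uniform random permutation of n items,
    computed by inclusion-exclusion."""
    return sum((-1) ** k / factorial(k) for k in range(n + 1))


if __name__ == "__main__":
    for n in (10, 23, 40):
        print(n, round(birthday_exact(n), 4), round(birthday_poisson(n), 4))
    # Matching problem: the exact no-match probability approaches exp(-1).
    print(round(no_match_exact(10), 6), round(exp(-1), 6))
```

With n = 23, the exact probability is slightly above one half while the Poisson approximation is almost exactly one half, which gives a feel for the size of the approximation error that the article's bounds address.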
Similar resources
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
Remodified Bessel Functions via Coincidences and Near Coincidences
By considering a particular probabilistic scenario associated with coincidences, we are led to a family of functions akin to the modified Bessel function of the first kind. These are in turn solutions to a certain family of linear differential equations possessing structural similarities to the modified Bessel differential equation. The Stirling number triangle of the second kind arises quite n...
Sydney Basin
This paper observes and compares patterns of above-average rates of lung cancer occurrence in the Sydney metropolitan area with typical patterns of air pollution circulation within the Sydney basin, as well as with the locations of primary source and ‘sink’ areas of toxic air emissions. It establishes that there is a strong coincidence, which is unlikely to be entirely due to the demographics of ...
IRF and ISRF Sequences and their Anti-Pedagogical Value
Initiation, Response, and Feedback (IRF) sequences are the most frequent interaction pattern in any classroom context. IRF sequences have been examined extensively in previous studies and were reported to be negatively correlated with participation opportunities (Kasper, 2006; Cazden, 2001; Ellis, 1994). In all these studies, all contingent factors of any classroom context which might influence in...
Molecular typing of avian Escherichia coli isolates by enterobacterial repetitive intergenic consensus sequences-polymerase chain reaction (ERIC-PCR)
BACKGROUND: Colibacillosis is one of the most economically important diseases of poultry worldwide. OBJECTIVES: This study was conducted to examine the clonal relatedness and typing of 95 avian Escherichia coli isolates by ERIC-PCR. METHODS: Sixty-three E. coli isolates from two common manifestations of colibacillosis (yolk sac infection and colisepticemia) and 32 isolates from feces of apparen...
Publication date: 2004